Getting Dense Word Embeddings
How could we actually encode semantic similarity in words?
Represent each word by giving it a number of unique semantic attributes (the third approach, following on from the introduction).
Example: represent mathematician and physicist as vectors.
Both can run → the "is able to run" attribute has the value 2.3 for mathematician and 2.5 for physicist.
Each word becomes a vector whose dimension equals the number of attributes.
The cosine similarity between two word vectors then gives their (semantic) similarity.
Coming up with these attributes by hand is a huge pain.
Cosine similarity
Extremely similar words (words whose embeddings point in the same direction) will have similarity 1; extremely dissimilar words should have similarity -1.
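A minimal sketch of this hand-built representation (the attribute values below are illustrative, following the mathematician/physicist example): each word is a vector of attribute scores, and the cosine of the angle between two vectors measures their similarity.

```python
import torch
import torch.nn.functional as F

# Hypothetical hand-crafted attribute vectors, one score per attribute
# (dimensions: "is able to run", "likes coffee", "majored in physics").
mathematician = torch.tensor([2.3, 9.4, -5.5])
physicist = torch.tensor([2.5, 9.1, 6.4])

# Cosine similarity = dot product / (product of the norms) = cosine of the
# angle between the vectors; close to 1 for similar words, -1 for dissimilar ones.
sim = F.cosine_similarity(mathematician, physicist, dim=0)
print(sim.item())
```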
Central to the idea of deep learning is that the neural network learns representations of the features, rather than requiring the programmer to design them herself.
Why not just let the word embeddings be parameters in our model, and then be updated during training? This is exactly what we will do.
Note that the word embeddings will probably not be interpretable.
Compared with the hand-made attribute approach above, the learned embeddings cannot be interpreted.
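A rough sketch of what this looks like in PyTorch (the vocabulary and sizes below are arbitrary toy values): torch.nn.Embedding is a learnable lookup table whose rows are the word vectors, so they receive gradients and are updated during training like any other parameters.

```python
import torch
import torch.nn as nn

word_to_ix = {"mathematician": 0, "physicist": 1}  # toy vocabulary
vocab_size, embedding_dim = len(word_to_ix), 5      # arbitrary dimensionality

# A learnable lookup table with one embedding_dim-dimensional row per word;
# embeds.weight is an ordinary parameter updated by the optimizer.
embeds = nn.Embedding(vocab_size, embedding_dim)

# Look a word up by its integer index to get its (randomly initialized) vector.
lookup = torch.tensor([word_to_ix["mathematician"]], dtype=torch.long)
print(embeds(lookup))
```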
Summary
Word embeddings are a representation of the *semantics* of a word, efficiently encoding semantic information that might be relevant to the task at hand.